Interview Analysis Code

Overview

Code examples for sentiment analysis of football manager interviews, supporting both R and Python workflows.

Related to: Weeks 5-6 (Text as Data)

API Setup

api-key.R

Template for setting up OpenAI API credentials in R.

# Set your API key
Sys.setenv(OPENAI_API_KEY = "sk-....")

Important: Copy this template to my-openai-api-key.R, put your real key in the copy, and add my-openai-api-key.R to .gitignore so the key is never committed.

R Scripts

sentiment-analysis.R

Complete sentiment analysis workflow using OpenAI API.

Features:

  • API key setup and validation
  • Batch processing with progress tracking
  • Sentiment classification function (-2 to +2 scale)
  • Error handling and retry logic
  • Comparison between multiple API runs

Key functions:

  • classify_text(): Sends text to OpenAI for sentiment rating
  • Handles API errors with exponential backoff
  • Saves results to CSV for further analysis

domain_lexicon.r

Creates a football-specific sentiment lexicon.

Output: domain_lexicon.csv with:

  • 100+ football-specific terms
  • Positive terms (goals, win, excellent): scores 1.2 to 1.9
  • Negative terms (lose, poor, mistake): scores -1.9 to -1.2
  • Neutral terms (match, team, play): score 0
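One way a lexicon like this could be applied is to average the scores of the lexicon terms that occur in a text. The Python sketch below is illustrative only: the column names term and score, the sample rows, and the helper names load_lexicon and score_text are assumptions, not taken from domain_lexicon.r or domain_lexicon.csv.

```python
import csv
import io
import re

# Made-up sample rows for illustration. The real domain_lexicon.csv may use
# different column names and values; scores here follow the ranges above.
SAMPLE_LEXICON_CSV = """term,score
win,1.5
excellent,1.9
lose,-1.5
mistake,-1.2
match,0
"""


def load_lexicon(handle):
    """Read term/score pairs from an open CSV file into a dict."""
    reader = csv.DictReader(handle)
    return {row["term"].lower(): float(row["score"]) for row in reader}


def score_text(text, lexicon):
    """Return the mean score of lexicon terms found in text (0.0 if none)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[token] for token in tokens if token in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


lexicon = load_lexicon(io.StringIO(SAMPLE_LEXICON_CSV))
print(score_text("An excellent win after a difficult first half", lexicon))
```

Note that in this sketch matched neutral terms (score 0) count toward the mean and pull it toward zero; summing scores or ignoring neutral terms are equally plausible choices.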

Python Scripts

classify_manager_sentiment.py

Python equivalent of the R sentiment analysis workflow (sentiment-analysis.R).

Features:

  • OpenAI API integration using openai library
  • Logging and progress tracking with tqdm
  • Robust error handling
  • Same rating guidelines as R version
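Part of robust error handling is checking that a model reply really contains a rating on the -2 to +2 scale. The sketch below shows one possible validation step. The function name parse_rating and its behaviour (take the first standalone integer, reject decimals and out-of-range values) are assumptions for illustration, not the logic of classify_manager_sentiment.py.

```python
import re
from typing import Optional

# Matches a standalone integer with an optional sign. Decimals such as
# "1.5" are rejected rather than silently truncated.
_INTEGER = re.compile(r"(?<![\d.])[+-]?\d+(?!\.?\d)")


def parse_rating(reply: str, low: int = -2, high: int = 2) -> Optional[int]:
    """Return the first integer in reply if it lies in [low, high], else None."""
    normalized = reply.replace("\u2212", "-")  # Unicode minus to ASCII hyphen
    match = _INTEGER.search(normalized)
    if match is None:
        return None
    value = int(match.group())
    return value if low <= value <= high else None
```

Returning None instead of raising lets a batch loop flag or retry the affected text without stopping the whole run.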

Dependencies:

pip install openai pandas tqdm python-dotenv

manager_sentiment_api.md

Documentation and code template for Python API usage.

Usage Notes

API Key Management

  • Store keys in environment variables
  • Never commit keys to version control
  • Budget ~$5 for course exercises
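As a minimal Python sketch of the first point, a script can read the key from the environment and fail early with a clear message if it is missing. The helper name get_api_key and the placeholder check are assumptions for illustration; only the variable name OPENAI_API_KEY comes from the template above.

```python
import os
from typing import Mapping, Optional


def get_api_key(env: Optional[Mapping[str, str]] = None,
                name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error."""
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running.")
    if "..." in key:
        # Assumption: a key containing "..." is the unedited template value.
        raise RuntimeError(f"{name} still holds the placeholder value.")
    return key
```

Passing a plain dict as env makes the function easy to test without touching the real environment.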

Rate Limiting

  • Both sentiment scripts (sentiment-analysis.R and classify_manager_sentiment.py) include retry logic
  • Exponential backoff for failed requests
  • Monitor usage through OpenAI dashboard
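The retry pattern named above can be sketched in a few lines of Python. This is a generic illustration, not the code from either script: the name call_with_backoff and its parameters are hypothetical, and a real script would likely pass the client library's specific rate-limit error as retry_on rather than catching every exception.

```python
import time


def call_with_backoff(func, max_retries=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying failed calls with exponentially growing delays.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    and re-raises the last error once max_retries retries are used up.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

The sleep argument is injectable so the delays can be checked in a test without actually waiting. Adding a small random jitter to each delay is a common refinement when many clients retry at once.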

Reproducibility

  • Set temperature=0 for more consistent results (outputs can still differ slightly between runs)
  • Save intermediate results
  • Document model versions used

Output Files

Scripts generate:

  • manager_sentiment_results.csv: Individual text ratings
  • sentiment_comparison.csv: Comparison between runs
  • classification.log: Detailed processing logs
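To make the idea of a comparison between runs concrete, the Python sketch below computes two simple summary measures for ratings of the same texts from two runs. The function name compare_runs and the chosen measures are assumptions for illustration; the actual layout of sentiment_comparison.csv is not specified here.

```python
def compare_runs(run_a, run_b):
    """Summarize agreement between two equally long lists of ratings.

    Both lists must rate the same texts in the same order.
    """
    if len(run_a) != len(run_b):
        raise ValueError("Both runs must rate the same number of texts.")
    if not run_a:
        raise ValueError("Cannot compare empty runs.")
    diffs = [abs(a - b) for a, b in zip(run_a, run_b)]
    return {
        "n": len(diffs),
        # Share of texts that received exactly the same rating in both runs.
        "exact_agreement": sum(d == 0 for d in diffs) / len(diffs),
        # Average size of the disagreement on the -2 to +2 scale.
        "mean_abs_diff": sum(diffs) / len(diffs),
    }


summary = compare_runs([2, 0, -1, 1], [2, 1, -1, -1])
print(summary)
```

The same function could be used to compare an API run against student ratings, as long as both lists are aligned on the same texts.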

Integration with Course Data

These scripts work with:

  • /data/interviews/interview-texts-only.xlsx (input)
  • /data/interviews/domain_lexicon.csv (lexicon)
  • Student rating files for comparison analysis